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Triaxial orbit based galaxy models with an application to the 
(apparent) decoupled core galaxy NGC 4365 
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ABSTRACT 

We present a flexible and efficient method to construct triaxial dynamical models of galaxies 
with a central black hole, using Schwarzschild's orbital superposition approach. Our method 
is general and can deal with realistic luminosity distributions, which project to surface bright- 
ness distributions that may show position angle twists and ellipticity variations. The models 
are fit to measurements of the full line-of-sight velocity distribution (wherever available). We 
verify that our method is able to reproduce theoretical predictions of a three-integral triaxial 
Abel model. In a companion paper (van de Ven, de Zeeuw & van den Bosch), we demon- 
strate that the method recovers the phase-space distribution function. We apply our method 
to two-dimensional observations of the E3 galaxy NGC 4365, obtained with the integral-field 
spectrograph SAURON, and study its internal structure, showing that the observed kinemat- 
ically decoupled core is not physically distinct from the main body and the inner region is 
close to oblate axisymmetric. 
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1 INTRODUCTION 

Binney (1976, 1978) argued convincingly that elliptical galaxies 
may well have triaxial intrinsic shapes, based on the observed 
slow rotation of the stars (Bertola & Capaccioli 1975; Illingworth 
1977), the presence of isophote twists in the surface brightness 
distribution (e.g., Williams & Schwarzschild 1979), the presence 
of velocity gradients along the apparent minor axis ('minor-axis 
rotation', Schechter & Gunn 1978), and evidence from N-body 
simulations (Aarseth & Binney 1978). Schwarzschild's (1979, 
1982) numerical models demonstrated that such systems can 
be in dynamical equilibrium, and suggested that their observed 
kinematics can be rich (see also, e.g., Statler 1991). This is 
supported by the discovery of kinematically decoupled cores in 
the late nineteen-eighties (Bender 1988; Franx & Illingworth 
1988) and, more recently, by observations with integral-field 
spectrographs such as SAURON, which reveal that some ellipticals 
have point- symmetric rather than bi-symmetric velocity fields, and 
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often contain kinematically decoupled components (e.g., Emsellem 
et al. 2004). This means that these galaxies are not axisymmetric. 

Subsequent work on triaxial dynamical models focused 
mostly on models with a cusp in the central density profile, on the 
effect of a central black hole, and on the range of shapes for which 
triaxial models could be in dynamical equilibrium (e.g., Gerhard 
& Binney 1985; Levison & Richstone 1987; Statler 1987; Hunter 
& de Zeeuw 1992; Schwarzschild 1993; Merritt & Fridman 1996; 
Siopis 1998; Terzic 2002). With the exception of studies of the 
Galactic Bulge (Zhao 1996; Hafner et al. 2000), most of this work 
was restricted to finding (numerical) distribution functions consis- 
tent with a given triaxial density. This showed that many different 
distribution functions may reproduce the same triaxial density, and 
that these dynamical models all have different observable kinematic 
properties, but detailed comparison to observations received little 
attention (Arnold, de Zeeuw & Hunter 1994; Mathieu & Dejonghe 
1999). Ad-hoc kinematic models were used to constrain the distri- 
bution of intrinsic shapes (Binney 1985; Franx, Illingworth & de 
Zeeuw 1991) or of individual objects (Tenjes et al. 1993; Statler 
1994b, 2001; Statler, Dejonghe & Smecker-Hane 1999; Statler 
et al. 2004). 
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The possibility to measure accurate line-of-sight velocity dis- 
tributions (LOSVDs) in elliptical galaxies from observations of 
the stellar absorption lines (e.g., Bender 1990; van der Marel & 
Franx 1993), and the realization that these are the only way to 
distinguish radial variations in mass-to-light-ratio M/L from ra- 
dial variations in the anisotropy of the orbital structure (Dejonghe 
1987; Gerhard 1993), led to the development of detailed spheri- 
cal, and subsequently axisymmetric, numerical dynamical models 
aimed to fit all these kinematic measurements. These are gener- 
ally constructed with a variant of Schwarzschild's (1979) orbit su- 
perposition method, in which occupation numbers are found for a 
representative library of orbits calculated in the gravitational po- 
tential of the galaxy. The aim is to measure the mass of the central 
black holes (van der Marel et al. 1998; Gebhardt et al. 2003; Val- 
luri, Merritt & Emsellem 2004; Valluri et al. 2005; van den Bosch 
et al. 2006; Shapiro et al. 2006), to deduce the properties of dark 
halos (e.g., Rix et al. 1997; Cretton et al. 1999; Gerhard et al. 2001 ; 
Thomas et al. 2005), and to derive the internal orbital structure and 
intrinsic shape (Verolme et al. 2002; Cappellari et al. 2002, 2006; 
Krajnovic et al. 2005; van de Ven et al. 2006). Some galaxies dis- 
play significant signatures of non-axisymmetry, suggesting they are 
intrinsically triaxial. 

The logical next step is to construct realistic triaxial models, 
which fit the details of the observed surface brightness, including 
isophote twists, nuclear stellar disks and a central cusp, as well as 
the two-dimensional kinematic measurements. This is a non- trivial 
undertaking, as the parameter range to be explored for a given 
model is significantly larger than in axisymmetric geometry, and 
the internal dynamical structure is more complicated, as it includes 
four major orbit families, a host of minor families and chaotic or- 
bits. However, the ability to construct such models will make it 
possible to derive reliable intrinsic parameters for giant elliptical 
galaxies, and opens the way for a systematic exploration of their 
properties. In this paper, we describe a practical method for do- 
ing this, and report an application which accurately reproduces the 
two-dimensional kinematic measurements of the triaxial E3 galaxy 
NGC 4365, obtained with SAURON. In the companion paper (van 
de Ven et al. 2007, hereafter vdV07) we apply the method to analyt- 
ical triaxial three-integral models and show that it reliably recovers 
the input three-integral distribution function. 

We start with a short section on Schwarzschild's method (Sj2]l, 
which includes a brief overview of our implementation. We then 
give a step-by-step description of the main properties of our for- 
malism (§3}{5). In [J6]we test the method, including the ability to 
recover the global input parameters. We construct a triaxial model 
for NGC 4365 in S|7] and we summarize our conclusions in ij8] 



2 SCHWARZSCHILD'S METHOD 
2.1 Brief historical overview 

Schwarzschild's (1979) orbit superposition method is a flexible 
method to build dynamical models of early- type galaxies. The 
original implementation was aimed at reproducing a given triax- 
ial density distribution. Subsequently, it was applied to a large va- 
riety of density distributions, from spherically and axially sym- 
metric (Richstone 1980, 1982, 1984; Richstone & Tremaine 1984; 
Levison & Richstone 1985; Valluri et al. 2004) to triaxial shapes 
(e.g., Schwarzschild 1982, 1993; Vietri 1986; Levison & Richstone 
1987; Statler 1987; Merritt & Fridman 1996; Siopis & Kandrup 
2000; Hafner et al. 2000). 



Pfenniger (1984) showed that it is possible to include mea- 
surements of the mean line-of-sight velocity and the second veloc- 
ity moment, provided that the true second velocity moment {v 2 } is 
used and not the velocity dispersion a' 2 — (v 2 ) — (v) 2 . The reason 
for this requirement is that the dispersion depends quadratically on 
the first velocity moment and can therefore not be included in a 
linear orbit superposition method (but see Dejonghe 1989). Zhao 
(1996) used this principle to build triaxial models of the Galac- 
tic Bulge. At the same time, theoretical (Dejonghe 1987) and ob- 
servational (Franx & Illingworth 1988) investigations showed that 
LOSVDs are generally not Gaussian-shaped and higher-order ve- 
locity moments are required to describe the true profile. This stim- 
ulated the use of the so-called Gauss-Hermite (GH) moments (van 
der Marel & Franx 1993; Gerhard 1993). 

The first implementations of Schwarzschild's method that 
used additional kinematic information were designed for the mod- 
eling of spherical galaxies (Richstone & Tremaine 1984; Rix et al. 
1997). Orbits in these models obey four integrals of motion: the 
energy E and all three components of the angular momentum 
L = (L x ,L y ,L z ). While useful, this software was still of limited 
applicability, as most galaxies are not round, but axisymmetric or 
triaxial. Orbits in oblate axisymmetric galaxies conserve at least the 
two classical integrals E and L z (which is the component of the an- 
gular momentum along the short axis), while it has been known for 
a long time that most orbits in our Galaxy conserve an additional 
non-classical third integral of motion (e.g., Contopoulos 1960; Ol- 
longren 1962). A more general version of the Schwarzschild soft- 
ware was therefore developed to model axisymmetric galaxies with 
three-integral distribution functions (van der Marel et al. 1998; 
Cretton et al. 1999; Thomas et al. 2005). Results that were ob- 
tained with the extended Schwarzschild method indeed showed that 
the third integral is an essential ingredient of realistic axisymmetric 
galaxy models (van der Marel et al. 1998; Verolme et al. 2002), that 
we can derive information on the phase-space structure of galaxies 
(Cappellari et al. 2002; Krajnovic et al. 2005), that we can use the 
method to measure the mass of the central black hole in galaxies 
(Gebhardt et al. 2003) and that proper motion kinematic observa- 
tions can be used (van de Ven et al. 2006), provided that the models 
have sufficient internal freedom, e.g., the total number of orbits is 
large enough (Valluri et al. 2004; Thomas et al. 2004; Cretton & 
Emsellem 2004; Richstone et al. 2004; Magorrian 2006). 



2.2 Generalization to triaxial geometry 

The method described here uses many of the ideas and algorithms 
described in Rix et al. (1997), van der Marel et al. (1998), (), () and 
(). The computer program for triaxial geometry was written largely 
from scratch. 

The standard implementation of the extended Schwarzschild 
method starts from a surface brightness distribution, which we 
parametrise with a sum of Gaussians ( i|3.U . The intrinsic mass 
distribution and potential are then obtained by deprojecting the sur- 
face density, which requires a choice for the viewing angle(s) along 
which the object is observed (i ]3.3t . The potential calculation is out- 
lined in 33~8l 

In the potential, the initial conditions for a representative orbit 
library are found (Q. These orbital components must include all 
types of orbits that the potential supports, to avoid any bias (e.g., 
Thomas et al. 2004). 

Schwarzschild's method tries to find a steady-state model of 
a galaxy, requiring orbital building blocks to be time-independent. 
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We integrate the orbits for a fixed time of 200 times the period of a 
closed elliptical orbit with the same energy. 

During orbit integration, the intrinsic and projected properties 
are stored on grids, in order to allow for comparison with the data 
(23}. The quantities that will be compared to observations are spa- 
tially convolved with the same point spread function (PSF) as the 
observations. 

After orbit integration, the superposition of orbits whose prop- 
erties best match the observational data is determined. The superpo- 
sition can be constructed by using linear or quadratic programming 
(Schwarzschild 1979, 1982, 1993; Vandervoort 1984; Dejonghe 
1989), maximum entropy methods (Richstone & Tremaine 1988; 
Gebhardt et al. 2003; Thomas et al. 2004) or with a least-squares 
solver as was used in many of the axisymmetric three-integral im- 
plementations (Rix et al. 1997; van der Marel et al. 1998; Cappel- 
lari et al. 2006). Here we use the a quadratic programming solver as 
it finds the best-fitting superposition in a least squares sense, while 
allowing for additional contraints ( ^5.U . 



any arbitrary positive density distribution. However, MGE models 
can reproduce a large variety of densities, which appears realistic 
when projected along any viewing direction, including mass mod- 
els with radially varying triaxiality, multiple photometric compo- 
nents and disks. 

Accordingly, we write the triaxial MGE luminosity density as 



1 



x exp 



2a] 



(1) 



where N is the number of required Gaussian components, Lj is the 
luminosity of the jth Gaussian, pj and qj are the axial ratios, and <jj 
is the corresponding dispersion along the x-axis. Moreover, M/L 
is the mass-to-light ratio, and (x, y, z) is a system of coordinates 
centered on the common origin of the Gaussians and aligned with 
the common principal axes of the Gaussians. 



3 MASS PARAMETERIZATION, POTENTIAL AND 
ACCELERATIONS 

In this section, we describe the method that we use to obtain a triax- 
ial mass model from the observed surface brightness. We describe a 
convenient mass parametrisation and derive the corresponding po- 
tential and accelerations. A summary of symbols introduced in this 
section is given in Table[TJ 



3.1 The MGE parameterization 

In order to derive the intrinsic luminosity density from the observed 
galaxy surface brightness, a deprojection is required. For a spher- 
ical galaxy, this leads to a unique solution (Binney & Tremaine 
1987). This is not the case for an axisymmetric object, unless it is 
seen edge-on (Rybicki 1987). This non-uniqueness is even stronger 
for triaxial shapes, where the deprojection is not unique from any 
viewing direction (e.g., Gerhard 1996). For this reason, the assump- 
tion that an object is triaxial is not sufficient to uniquely recover the 
intrinsic luminosity density from an observed image, and additional 
assumptions have to be made. 

The simplest option is to assume that the intrinsic density is 
stratified on similar triaxial ellipsoids. The isophotes that are pro- 
duced by such a mass model are similar coaxial ellipses (Contopou- 
los 1956; Stark 1977), which is approximately consistent with ob- 
servations of some galaxies. However, many objects display po- 
sition angle twists and ellipticity variations, which cannot be re- 
produced by these simple models. More flexible mass models are 
therefore required to reproduce these observed features. 

A general approach to the triaxial deprojection problem would 
be to use fully non-parametric methods (e.g., Scott 1992). This has 
already been done in the axisymmetric case by Romanowsky & 
Kochanek (1997) and in the triaxial case by Bissantz & Gerhard 
(2002). Unfortunately, these methods are complicated, require a 
significant amount of time before convergence is reached, and do 
not always provide a global solution. 

We therefore decided to parameterize the mass distribution by 
using a Multi-Gaussian Expansion (MGE; Monnet, Bacon & Em- 
sellem 1992; Emsellem, Monnet, & Bacon 1994; Cappellari 2002). 
We assume that the intrinsic density can be described as a sum of 
coaxial triaxial Gaussian distributions. The Gaussians do not con- 
stitute a complete basis of functions and therefore cannot reproduce 



3.2 Transformation from intrinsic to projected coordinates 

To be able to compute the projection of the density in eq. Q} on 
the sky-plane, we introduce a new coordinate system, (x',y',z') 
as defined in Binney (1985). Here, z' is located along the line-of- 
sight and x is in the (x, y)-plane. 

To go between these coordinate systems two transformatons 
are needed. First, a projection to the sky-plane given by a projection 
matrix 



— sin <p 
- COS # cos If 
sin ■& cos tp 



cos tp s * 

- cos $ sin ip sin $ 
sin d sin tp cos -d j 



(2) 



where the two usual spherical coordinates (#, ip) define the orienta- 
tion of the line-of-sight with respect to the principal axes of the ob- 
ject. For example, (90° ,0°), (90°,90°), (0°,0°...90°) are the views 
down the long, intermediate and short axis, respectively. Secondly, 
a rotation on the sky-plane is given by the matrix 



(sin ip — cos ip ' 
cos ip sin ip 
1/ 



(3) 



The angle ip is required to specify the rotation of the object around 
the line-of-sight. The rotation ip is chosen to align the major axis of 
the projected ellipse (of the innermost MGE component, see eq. ® 
below) with the x'-axis. For an oblate axisymmetric intrinsic shape 
tp equals 90° . 

3.3 The observed surface brightness of an MGE 

The projected surface brightness (SB) that corresponds to the den- 
sity of eq. <[TJ can be written as a sum of two-dimensional Gaussians 
of the form: 



SB(R',9') = J2 



2af 



with 

x'j — R' sin(0' — ip'j) and 



R' cos(#' — ip'j), 



(4) 



(5) 



where (R ,6') are polar coordinates on the sky-plane. The Gaus- 
sian components have axial ratio ^ q'j ^ 1, dispersion a'j along 
the major axis, and position angle ip'j, measured counterclockwise 



© 2007 RAS, MNRAS 000. [711271 



4 van den Bosch et al. 



from the y'-axis to the major axis of each Gaussian. The misalign- 
ment angle ip'j cannot be measured directly as the position of the 
intrinsic y'-axis is not observable. We define 

ip'j = V + Aip'j with Atp[ = 0, (6) 

where Atp'j is the isophotal twist of each Gaussian, which can be 
measured directly. 

3.4 From projected to intrinsic shape 

To determine the parameters of the Gaussians in eq. (TJ, we fit the 
two-dimensional MGE model of eq. (|4j to the observed surface 
brightness. After assuming the space orientation cp, ip) of the 
galaxy, the relations between the observed quantities (a' 3 ■, q' 3 ■, ip'j) 
and the intrinsic ones {a-j,p 3 , q 3 ) are given by Cappellari et al. ( 
2002; for a different formalism see Monnet et al. 1992) 

2 S'j [2 cos2i/)j +sin2i/)j (sec -d cot ip — cos ■& tan </?)] ^ 

^ 2 sin 2- !? [S'j cos ip', (cos ip', +cot tp sec $ sin ip'- ) — l] ' 

2 2 <5j [2 cos2i/)j +sin2i/)j (cos i? cot 99 — sec tan y>)] 

Pj 9j 2sin 2 i9[(5^ cos^(cos V>j+cot V5seci9sint/'j) — l] ' 

u 2 = \ yjp^ cos 2 i9 + g 2 sin 2 #(p 2 cos 2 ip + sin 2 ^), (9) 

where 5j = 1 — qf, and u 3 = a' 3 /(j 3 , the scale-length projec- 
tion compression factor, which together with the dimensionaless 
parameters and define the intrinsic shape. The mathematical 
constraint q 3 > and p 3 > (or the stronger and more physi- 
cal constraint q 3 > 0.2 and p 3 > 0.4, which gives the range of 
reasonable axis ratios for an elliptical galaxy, Binney & de Vau- 
couleurs 1981) implies that each Gaussian can be deprojected only 
for a limited range of orientations (see also Monnet et al. 1992). 
The orientations for which the whole MGE model can be depro- 
jected are located in the intersection of the regions that are allowed 
by the individual Gaussian components. 

3.5 Constructing a realistic triaxial MGE 

The individual Gaussian components have no direct physical sig- 
nificance, but their parameters provide constraints on other, more 
important, quantities. We must therefore be careful that the MGE- 
model does not result in spurious conditions on the physical prop- 
erties of the galaxy. The allowed intrinsic orientation of the galaxy 
depends on the axis ratios of the Gaussians in the superposition. It 
can be easily verified numerically that the region in the space of the 
rotation angles (■£), ip, ip'j) for which a Gaussian with a given ob- 
served flattening q'j can be deprojected increases with q'j : a round 
Gaussian {q'j — 1) can be deprojected for any assumed intrinsic 
orientation, while an extremely flat one (q'j <C 1) can only be de- 
projected when the object is observed along one of its principal 
planes. Moreover, when a Gaussian has an photometric twist Aip'j 
with respect to the other Gaussians in the MGE, than the allowed 
deprojection region ($, ip, ip'j) becomes even smaller. 

The Gaussians in a given MGE-superposition generally have 
different values of q 3 and ip'j. This means that the MGE-model 
as a whole can only be deprojected for angles that appear in 
the intersection of the allowed individual regions (j),(p,ip'j) of 
the deprojection of the individual Gaussians. The largest depro- 
jectable volume is obtained by maximizing min{gj} and minimiz- 
ing max{| Aipi • |}, while still fitting the photometry within a certain 
accuracy (see also Cappellari 2002). This is verified numerically in 





40 


- 


/ (/ / 






30 




J A° 




twist 








o : 


sophota 


20 


f / 


/ 4° 


/ J: 


o 










"o 

i — 


10 






/ - 

/ / 











. it 




0.4 


0.6 0.8 


1 








Minimum flattening (q) 









Figure 1. Contours of the deprojectable volume of an hypothetical MGE 
as a function of the observed isophotal twist and flattening. The horizontal 
axis shows the minimum projected flattening and the vertical axis the maxi- 
mum isophotal twist of all the Gaussians in the MGE. The labels denote the 
percentage of Euler angle space that can be deprojected. 



Fig. [T] which shows contours of the allowed volume available for 
deprojection, for given minimum flattening and isophotal twist of 
an MGE model. 

The MGE models that are obtained in this way have, by con- 
struction, the largest set of orientations for which a triaxial depro- 
jection is possible. For any given orientation, the roundest projec- 
tion on the sky corresponds to the roundest triaxial intrinsic density. 
This means that this MGE model will be the roundest one that fits 
the observations for any given intrinsic orientation of the galaxy. 
Very boxy or very disky models are therefore excluded. 

3.6 Light to Mass 

At this stage it is possible to add the contribution of invisible mass 
to the gravitational potential of the model. This can be done by 
creating two MGE models for the galaxy. One for the gravitating 
matter and one for the visible light. The matter MGE is then used 
for the calculation of the potential and the light MGE is used to 
reproduce the intrinsic and observed light distribution. To simulate 
a radial M/L profile one can construct a matter MGE by multiply- 
ing the luminosity of each Gaussian of the light distribution with 
the desired (M/L) 3 at that radius, see for example van den Bosch 
et al. (2006). In this way, it is possible to construct a large range of 
potentials or surface brightness distributions, as long as the matter 
and light distributions can be represented by Gaussians. Alterna- 
tively one could make (Mj L) in eq. (TJ a function of (x, y, z). 

3.7 Deprojection 

The parameter range that has to be explored when fitting general 
triaxial models to observations of elliptical galaxies is large. Two 
axis ratios and three angles are needed to specify the intrinsic shape 
and orientation of a triaxial ellipsoid, while only the projected flat- 
tening q' 3 and the relative position angle Atp' 3 of the projected major 
axis can be deduced from photometric observations. As there is a 
relation between q' 3 , the viewing angles and the intrinsic shape pa- 
rameters of the galaxy, the allowed range of intrinsic shapes can 
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be constrained to some degree, but a large freedom remains. In 
some cases, additional information, such as the value of the kine- 
matic offset angle or the relative position of a gas disk or dust lane, 
can provide further constraints (e.g., Bertola et al. 1991). How- 
ever, unless two perpendicular gas disks are observed, no unique 
intrinsic shape can be deduced directly from the observations. The 
method to obtain triaxial MGE mass models from surface bright- 
ness data that was described in the above does not solve these prob- 
lems. However, it produces a range of regular, searchable and well- 
behaved triaxial density distributions that are consistent with the 
observed surface brightness, while being easy to handle computa- 
tionally. 

The shape of the reconstructed potential used in our models 
is directly related to the viewing angles by eqs. <f7j»-<[9jl . By chang- 
ing the viewing angles the potential of the model changes with it. 
However, the intrinsic shape parameters are much more natural pa- 
rameters than the viewing angles {&, tp, ip), as they influence the 
appearance of the orbits, and thus the kinematics they represent, 
much more directly. For example, for an axisymmetric deprojec- 
tion of the surface brightness the angle tp has no meaning, as rotat- 
ing along the symmetry axis does not change anything. Therefore 
we choose to study the effects of the deprojection in terms of the 
intrinsic shape parameters (p, q, u), which can be computed from 
the viewing angles tp, ip) given the averaged flattening q of the 
galaxy. The conversion from the intrinsic shape parameters to view- 
ing angles (which is the input for the models) is given b)Q 



Table 1. Summary symbols introduced in SectionfJ] 



Symbol Definition 



(x,y,z) Intrinsic coordinate system 
(x' , y' , z') Projected coordinate system 

tp, v) Viewing angles 
(p, q, u) Intrinsic shape parameters 
q' Averaged projected flattening 

Lj , cr'j , q'j Projected luminosity, dispersion, flattening 

of individual Gaussians 
(pj, qj,Uj) shape parameters of individual Gaussians 
Uj Intrinsic dispersion of individual Gaussians 

Misalignment angle of individual Gaussians 
Isophotal twist of individual Gaussians 



Aip> 



searched effectively. Since the models are computationally expen- 
sive the number of models cannot be too large. The MGE parame- 
terization of the surface brightness already excludes some viewing 
angles since their deprojection is unphysical (p < 0.4 or q < 0.4). 
Especially an isophotal twist reduces the allowed viewing angles. 
But also the sampling in the intrinsic shape, instead of viewing 
angles, helps reduce the number of models required, as this will 
avoid having models with (nearly) the same intrinsic shape. Over- 
all, for galaxies which are mildly flattened approximately 100 dis- 
tinct models are needed when sampling (p, q, it) in steps of 0.05. 



2 , (u 2 -q 2 )(q' 2 u 
cos v — 



(l-q 2 )(p 2 -q 2 ) ' 



tan ip 



(1 - u 2 )(l - q' 2 u 2 )(p 2 - q 2 ) ' 
(l-q> 2 u 2 )(p 2 -q' 2 u 2 )(u 2 -q 2 ) 
(1 — u 2 )(u 2 — p 2 )(q' 2 u 2 — q 2 ) 



(10) 



valid for q ^ p ^ 1, q ^ q' and max(g/g' ,p) ^ 11 ^ 
min(p/g', 1). Four of the eight possible solutions are unphysical 
or have q > p. The valid solutions are 



{ # , -<p , 4> }, 

{ , tp , Ip }, 

{ # , (p , -Ip }, 

{ 7T-1? , -If , -Ip }. 



(ID 



They represent the same intrinsic shape only seen from the oppo- 
site side and mirror images. They are thus identical and need not 
be modelled separately. However for Gaussians in the MGE with 
isophotal twist QAip'j | > 0) the intrinsic shape of the models with 
viewing angle ip and —ip are not the same, since the Aip'j offset 
deprojects (eqs. @-{9]l) them to a different (pj, qj,Uj), and thus a 
different intrinsic shape. Hence, in the case of isophotal twist we 
have to consider one solution from the first two lines in ( lilt , and 
one from the last two lines. 

To convert from (p, q, u) to (pj,qj, Uj) one uses eq.(|10t and 
the averaged flattening q to go to tp, ip), and then eq. ((SJ (and 
the observed isophotal twists) to go to tpj . From there one uses q'j 
and eqs. 0-(|9} to go to (pj ,qj,v,j). 

To find the best-fit intrinsic shape and corresponding view- 
ing angles for an observed galaxy the parameter space has to be 



1 The quantities u 2 and u 2 q' 2 are recognized as the conical coordinates 
/i and v, with which the projected properties of a triaxial ellipsoid of axis 
ratios p and q can be evaluated in an elegant manner (e.g., Franx 1988). 



3.8 Potential and accelerations 

The next step is to calculate the potential that corresponds to the 
mass distribution of eq. (TJ. This is done by using the classical 
Chandrasekhar (1969) formula for the potential that corresponds 
to a density stratified on similar concentric ellipsoids. This results 
in (Emsellem et al. 1994) 

i 



V(x,y,z) 
with 



drF(x,y,z,r), 



V 0J = (M/L) Jl9h. 



and 

F(x,y,z,r) = 
where 



exp 



+ 



V(l-M(l- ej T*) 



Pj 



and 



q 2 . 



(12) 



(13) 



(14) 



(15) 



Here, G is the gravitational constant and M/L is the mass-to-light 
ratio. Eq. J 1 2b has no simple analytic expression and must be evalu- 
ated numerically. The integrand is badly behaved in the central and 
outermost regions. It is therefore more efficient to replace eq. J 1 2b 
by analytical approximations in those regions. 

The central density of each Gaussian can be expanded as 



Pi( x ,y, z ) = Po,j ^2a n m 2n , 

n = 

with m 2 = x 2 + y 2 /p 2 + z 2 /q 2 and 
1 ( 1 



III 



2a] 



(16) 



(17) 
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This expansion generates a potential [e.g. eq. (29) of de Zeeuw & 
Lynden-Bell 1985] 



Vj(x,y,z) = — -M: 



+2Ai2,jX 2 y 2 + 2Ax 3>i x 2 z 2 + 2A 2 3,jy 2 z 2 ) + . 



(18) 



The index symbols Ai and An are given in Chandrasekhar (1969). 
For a moderately triaxial model, the expression Q8) differs less 
than 10 -4 from the exact potential for r < 0.1 <7j, with r 2 = 
x 2 + y 2 + z 2 . A higher-order Taylor expansion does not extend this 
limiting radius significantly. 

The potential outside r > 450-j can be approximated to within 
10 -4 by the monopole term in a multipole expansion, which corre- 
sponds to the potential of a central point mass with mass equal to 
that of the Gaussian 

GLj 



V j (x,y,z) = -(M/L) 



\/ x 2 + y 2 + z 1 



(19) 



Higher-order multipole terms hardly extend the range of applica- 
bility. Using equations d!8t and ( 119) , numerical integrations only 
have to be performed over the range O.loj < r < 45(7^, which 
speeds up the orbit integration significantly. 

The contribution of a central supermassive black hole is rep- 
resented by a Plummer potential 

, . G M, 

V.{x,y,z) = -- 



y/r% + x 2 + y 2 + z' 2 



(20) 



in which M. is the mass of the black hole and r s is a softening 
length, which can be set to a non-zero value to prevent the cen- 
tral potential to be infinite. In most applications, this smoothing is 
used, and r s is chosen to be significantly smaller than the smallest 
kinematic aperture. The black hole potential is added to V(x, y, z) 
from eq. j 1 2t to obtain the total galaxy potential. A separate dark 
halo potential can also be added at this stage, either using the MGE 
(see § |3.6I > or another, specific, expression. 

The orbit integration is performed in Cartesian coordinates. 
The stellar accelerations are given by the derivatives of equation 
d!2t with respect to x, y and z. Similar to what is done for the po- 
tential, the numerical calculation of the accelerations in the central 
and outer regions of the model are replaced by, respectively, a Tay- 
lor expansion and the dipole approximation. If we differentiate the 
terms in eq. J 1 3b . we obtain as first-order approximations 



3 

yVoj 

zV 



A i,j " ^2 (Ai ,]X 2 +A 12 , j y 2 +A 13 , ] z 2 ) 



A 



2,J" 



2a 2 



(A21JX 2 +A22jy 2 +A23,jZ 2 ) 



(21) 



■A 



■2,jy 2 +A 33J z 2 ) 



where we have suppressed the dependence of the left side of the 
equations on (x,y, z). These expressions differ less than a factor 
10~ 4 from the exact accelerations inside r < 0.1 Oj. As before, 
outside r > 45 Oj the accelerations can be approximated to within 
10~ 4 via the monopole term 

£G Lj 



{MjL) 



^/(x 2 +y 2 + z 2 ) 3 



(22) 



Similarly, the accelerations due to the black hole are given by 



yj (rf + x 2 + y 2 + z 2 



£ = x,y,z. 



(23) 



To make accurate and fast orbit integration possible, we interpo- 
late the total accelerations (a x ,a y ,a z ) onto a three-dimensional 
polar grid linearly in [log(r), 6, <f>]. For each grid point (r, 8, (f>) we 
store [log(— a x /x), \og(—a y /y), log(— a z /z)]. We can then com- 
pute the accelerations (a x , a y , a z ) at point (r, 9, <fi) with trilinear 
interpolation. After the interpolation grid has been computed we 
ensure that the minimum relative accuracy is better than 10 -4 . 



4 ORBITS 

Schwarzschild's method tries to find a numerical representation of 
the distribution function of a galaxy by assigning weights to a set 
of orbits. To avoid any bias and to allow for the maximum degree 
of freedom, the sample of orbits that the fitting routine can choose 
from must be as general as possible and 'representative' of the po- 
tential. In this section, we describe how this is achieved. We first 
introduce a triaxial Abel model from the companion paper (vdV07) 
that we use to test our method. We then discuss the orbit structure 
in separable and more general triaxial potentials. We continue with 
a description of the orbital initial conditions, orbit integration and 
storage grids that are used in our method. 

4.1 Separable test models 

The Abel models with a separable potential from the companion 
paper are a generalisation of the spherical Osipkov-Merritt models, 
introduced by Dejonghe & Laurent (1991) and extended by Math- 
ieu & Dejonghe (1999). These models have a distribution function 
(DF) that depends on three integrals of motions, contain a central 
core, and allow for a large range of (triaxial) shapes. The observ- 
ables of these models, including the LOSVD, can be calculated ef- 
ficiently and they can be used to generate test models that simulate 
realistic wide-field imaging and integral-field spectrograph kine- 
matics of galaxies. These mock observations serve as input for the 
triaxial Schwarzschild method presented in this paper. 

We use the triaxial test model from §4.3 of vdV07, which 
has an isochrone Stackel potential. This model resembles a triaxial 
1O 11 M0 galaxy at 20 Mpc with a kinematically decoupled compo- 
nent. We infer the potential from the MGE fit to the projected (total) 
surface mass density. To obtain the luminous mass density, we use a 
separate MGE that fits the surface brightness (assuming a constant 
(stellar) mass-to-light ratio of M/L — 4M0/Lq). The kinematics 
are constructed in such a way that they resemble SAURON observa- 
tions (Bacon et al. 2001). 

We will use this test model to demonstrate our method. More 
details and tests of the recovery of global parameters are given in 
section ^6]of this paper, whereas tests of the recovery of the internal 
structure and the DF can be found in the companion paper. 

4.2 Orbit structure 

In a separable triaxial potential, all orbits are regular and conserve 
three integrals of motion E, I2 and ^3, which can be calculated 
analytically. Four different orbit families exist: three types of tube 
orbits, which avoid the center and are therefore sometimes referred 
to as 'centrophobic', and a set of orbits that can cross the center, 
usually referred to as boxes or 'centrophilic' orbits e.g., Kuzmin 



© 2007 RAS, MNRAS OOO.mEHl 



Triaxial orbit based models 7 



2.0 


, i i i i 

- 

Equipotential 




- 


1 .5 


r i 

" Thin 






1 .0 


o \ 

X 






0.5 


_ 


s 


\ B \ 




Focal curve \ 

• s \ 






0.0 


I , , I I I 1 T 1 







0.0 0.5 1.0 1.5 2.0 

x 

Figure 2. The (x, z)-plane of a triaxial galaxy with a separable potential, 
for a value of the energy E that is large enough that all orbit families ap- 
pear. The figure shows the equipotential that corresponds to E, the focal 
hyperbola, the curve at which I2 = 0, and the location of the thin orbits. 
The regions where the different orbit families cross the (x, z)-plane per- 
pendicularly are indicated: 'B' denotes box orbits, 'S' corresponds to short- 
axis tubes and T and 'O' label inner and outer long-axis tubes. It can be 
seen that all tube orbits cross the (a;, z)-plane perpendicularly in two points: 
once in the region outside the thin orbit curve and once inside. This means 
that the (grey) region between the thin orbit curves comprises all orbits just 
once, which is important for the orbital sampling (Schwarzschild 1993). 



(1973); de Zeeuw (1985); Statler (1987). These different orbit fam- 
ilies conserve unique combinations of these integrals and can there- 
fore be linked to distinct volumes in phase-space. Maybe even more 
remarkably, all four orbit families in a separable potential cross 
the (x, z)-plane perpendicularly in well-defined regions (Fig. [2j 
Schwarzschild 1993). Similar to axisymmetry, all tubes except the 
so-called thin orbits (in which the inner and outer radial turning 
points coincide) cross the (x, z)-plane perpendicularly twice. At a 
given energy, these points are located in two distinct areas, sepa- 
rated by the line that connects the points of the thin orbits. This line 
can be parameterized analytically in a separable potential. 

These properties are summarized in Fig. [2] where we have 
used the isochrone separable potential of the triaxial Abel model. 
The figure shows the (a;, z)-plane for a value of the energy that is 
large enough that all orbit families are populated. The thick outer- 
most curve is the equipotential at this energy, the inner- and outer- 
most decreasing curves inside the equipotential connect the points 
where the thin orbits cross the [x, z)-plane perpendicularly, the in- 
termediate decreasing curve corresponds to I2 = 0, and the rising 
curve is the focal hyperbola. The four areas corresponding to the 
different orbit families are also indicated (see §5.4 of vdV07 for 
further details). 

This orbital structure depends crucially on the presence of a 
central core and is (partially) destroyed by the addition of a super- 
massive black hole and/or a central cusp (Gerhard & Binney 1985). 
Some orbits in these non-separable potentials do not conserve 
global integrals of motion other than the energy E and may not all 



cross the [x, 2) -plane perpendicularly. The three types of tube or- 
bits, including the thin tubes, are still supported cf. Schwarzschild 
(1993). Most box orbits are transformed into boxlets (Miralda- 
Escude & Schwarzschild 1989) and orbits that occupy certain parts 
of phase space become chaotic. The amount of chaotic motion and 
the radial range inside which it is present depends on the central 
cusp slope (see i]4.6t . 

4.3 Initial conditions 

The orbits in our models are more complicated than those in a sep- 
arable potential, as we use a more realistic MGE potential with a 
supermassive black hole. Still, we use the properties of separable 
models in our sampling of initial conditions. We sample the orbital 
energy implicitly through a logarithmic grid in radius. When the 
model has to reproduce observational data, it is important to sam- 
ple the orbital energy on a grid with a minimum radius that is at 
least an order of magnitude smaller than the pixel size of the obser- 
vations. In the case of Hubble Space Telescope data, this typically 
corresponds to ~ 10 -2 arcsec. The outer grid radius is determined 
by our constraint that the grid must include ^99.9 per cent of the 
mass. 

Each of the grid radii r% is linked to an energy by calculating 
the potential at (x, y, z) — (r%, 0, 0). The orbital initial conditions 
are then sampled from a dense grid in the (x, z)-plane. Since most 
orbits cross the (a;, z)-plane perpendiculary twice above js > it is 
not necessary to sample the whole plane. The double-countings are 
avoided by finding the location of the thin orbit curves iteratively: 
we launch orbits at different radii [keeping 9 = arctan(a; / z) fixed] 
until the width of the orbit is minimal. This is similar to what was 
done in the axisymmetric three-integral models, where all orbits are 
short-axis tubes. 

The starting points (x, z) are selected from a linear open polar 
grid (R, 9) in between the thin orbit and the equipotential (the grey 
area in Fig. [2}- The initial velocity in the y direction is determined 
from vlo = 2[V(x , 0, z ) - E t ] and {v x ,v z ) = (0,0). This 
three-dimensional set (E, R, 9) of starting conditions is commonly 
referred to as the '(x,z) start space' (Schwarzschild 1993). It is 
sufficient to launch orbits in only one direction when the density 
(or another quantity that is even in the velocity, such as the second 
moment) has to be reproduced. When the velocity (and odd higher- 
order velocity moments of the distribution function) is fitted in the 
model, the direction of the orbital motion is also important. This in- 
formation could be taken into account directly by launching orbits 
in both the positive and negative y-direction. However, the trajec- 
tories of the pro- and retrograde orbits are identical, which means it 
is much more efficient to include the counter-rotating orbits only at 
the fitting stage by reversing the velocity sign appropriately. This is 
only valid when figure rotation is ignored cf. Schwarzschild (1982). 

Since boxes are essential for the support of the triaxial shape 
(Schwarzschild 1979; Hunter & de Zeeuw 1992), a library with 
relatively few of them cannot be expected to reproduce a triaxial 
mass model. The (x, z) start space has few box orbits, especially 
at large radii (see Fig. |3)- To make sure that the orbit library pro- 
vides enough freedom in the outer parts of the model, we add ad- 
ditional box orbits, like Schwarzschild (1993). Box orbits always 
touch the equipotential (Schwarzschild 1979). We therefore sample 
start points on (successive) equipotential curves, using linear steps 
in the two spherical angles 9 and <j>. At each combination of (8, <f)) 
and E, we use bisection to find the corresponding value of ro that is 
located on the equipotential. This three-dimensional set (E, 9, (j>) of 
start conditions, the 'stationary start space' (Schwarzschild 1993), 
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Figure 3. Representation of the (x, z) and the stationary start space and their symmetries for the triaxial Abel model from vdV07. The panels show the orbital 
starting points for increasing energies (denoted at the top), from an inner shell of the model (left top) to an outer shell (right bottom). The symbols represent 
the position of the orbits in the start spaces. The orbits in the inner right quarter are in the (x, z) start space and the orbits placed in the outer right quarter are 
in the stationary start space ( £|4.3t . The thick black line represents the equipotential. (Compare to Fig. [2]) The orbits in the inner left quarter are the orbits from 
the (x, z) start space with reversed angular momentum and the orbits placed in the outer left quarter is identical to the outer right quarter and are only drawn 
to make the panels symmetric. The symbols show the result of the orbit classification (based on angular momentum conservation, £|4. 5Y . the crosses are box 
orbits, the stars correspond to short axis tubes and the diamonds correspond to (both types of) long axis tubes. We have also overplotted the analytical curves 
that separate the different type of orbits (see also Fig.|2]and vdV07). The solid rising curve is the focal hyperbola, with above it the long axis tubes and below 
it the short axis tubes and boxes. The crossing solid declining curve separates respectively between the inner and outer long axis tubes, and between the short 
axis tubes and boxes. The thin curves indicate the location of the corresponding thin tube orbits. 



results in box orbits or boxlets only. Tube orbits always conserve 
the sign of at least one component of the angular momentum and 
therefore never reach zero velocity. Since the direction of the orbits 
in this start space is not important it is not necessary to add velocity 
mirrored copies of them during the fit. 

By design the set of energies E and angles in 6 in both start 
spaces are identical, so that the orbits on the equipotential bound- 
ary of the (x, z) start space have obvious neighbours in stationary 
start space. While not necessary, the size of the third dimension of 
the start spaces is chosen to be the same for consistency. Both sets 
of orbits can be represented in a single figure, by graphically con- 
necting the corresponding starting spaces at the equipotential, as 
shown in Fig. [3] where selected energies of the triaxial Abel model 
(§ I4.lt are shown. In this figure we have overplotted the same lines 
from Fig. [2] which shows that our numerical scheme to locate the 
thin orbits indeed results in an orbit sampling from the correct re- 
gion. The stationary start space intersects with the x-z start space at 
the equipotential. In the figure all the orbits in the stationary start 
space that are nearest to the equipotential are plotted just outside 
the equipotential. Subsequent rows in the stationary startspace are 
plotted radially outwards. A mirror image of the stationary start 
space is also plotted for symmetry. 

4.4 Dithered orbit integration 

The orbits in the start space are integrated numerically and their 
properties stored. The integration is done in Cartesian coordinates, 
using an explicit Runga-Kutta method of order 5(4) (DOPRI5 rou- 
tine by Hairer, Norsett & Wanner 1993). With this method, the ma- 
jority of the orbits can be integrated with energy accuracies of better 



than one part in 10 5 . This routine is capable of dense output, which 
enables you to get a interpolated position and velocity at any time 
in current timestep during the integration. 

To ensure numerical precision the Runga-Kutta integrator uses 
more steps where the orbital trajectory changes direction quickly. 
Since this usually happens when the 'star' is traveling with a 
high velocity, the integrated time steps do not represent the time- 
averaged path of the orbit. To make sure this is not a problem we 
use the dense output of the integrator, to sample the orbit on equal 
time intervals, ensuring that the orbits are properly time weighted. 

Single orbits correspond to delta-functions in integral space, 
while the distribution function of a (partially) phase-mixed galaxy 
is expected to vary smoothly (Tremaine, Henon & Lynden-Bell 
1986). This limitation may be reduced by combining nearby or- 
bits (Richstone & Tremaine 1988; Rix et al. 1997). Here we extend 
this method by 'dithering' orbits in all three dimensions in the ini- 
tial starting space. We do this by taking a bundle of 5 3 orbits with 
different, but adjacent, initial conditions and sum their observables. 
This method is also used in the construction of axisymmetric mod- 
els see Cappellari et al. (2006). 

When calculating the starting spaces for the orbits we create 
more starting points for the dithering. We enlarge the sampling 
three-dimensional (E,8,(f)) start spaces five times in each direc- 
tion. This leads to 125 orbits per bundle. The odd number five was 
chosen so that each bundle has a clearly defined central orbit (see 
figure 6 in Cappellari et al. 2006). The orbital properties of each of 
the orbits in each bundle are simpy co-added. As an alternative, it 
is be possible to apply some form of (Gaussian) weighting. In this 
way the orbit bundles could be made to overlap, but the effects of 
this have not been studied. 
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Effectively, the model thus contains 125 times more orbits. 
The dithering causes the orbital building blocks to be smoother, 
eliminating aliasing effects, especially when modeling spatially ex- 
tended kinematic data. We found that this dithering is essential to 
obtain smooth orbital observables and remove numerical noise, us- 
ing a limited amount of orbits. 

4.5 Storage grids and symmetries 

For spherical galaxies, it is in principle sufficient to store the orbital 
properties in one dimension, along a line. The three-dimensional 
model can be reconstructed afterwards by deprojecting the radial 
profile back onto the sphere. Similarly, axial symmetry allows one 
to carry out the calculations in the meridional plane only. Revolu- 
tion of the model around the intrinsic short axis returns the three- 
dimensional intrinsic properties. As we restrict ourselves to station- 
ary, non-rotating galaxies that are symmetric in the three principal 
planes, all orbital properties have to be calculated in only one oc- 
tant. The properties in the other octants follow by symmetry. 

The density of every orbit in the library is stored on a spher- 
ical grid in (r g , 9 g ,4> g ) [0 g = corresponds to the short axis and 
(0 g — 7r/2, 4> g — 0) to the long axis]. The radial sampling is log- 
arithmic with the inner and outer boundary set to zero and infinity. 
The angular grids 9 g and cj> g are sampled linearly between and 
tt/2. The grid has N rg = 15, Ng g = 4 and N 4 , g = 5. This leads to 
20 bins per radius and 300 bins in total, which is enough to ensure 
that the mass is reproduced well and the model is self-consistent. 
When fitting the model, the intrinsic mass grid is used as a con- 
straint and is fitted everywhere with an accuracy of 2 % (see [Q. 

Similar to the intrinsic symmetries, the projected properties of 
spherical galaxies are one-dimensional and those of axisymmetric 
galaxies are symmetric in the projected axes. It is therefore suffi- 
cient to store the projected properties of spherical galaxies in one 
dimension and those of axisymmetric objects in one quadrant of the 
sky. The projected properties of triaxial galaxies are at most point- 
symmetric, with respect to the projected centre, which implies that 
the model-data comparison must be done in one half (or more) of 
the sky plane. 

To convert the intrinsic coordinates (x,y,z) to the projected 
coordinates (x',y',z') we use eqs. {2} and {3}. After this step the 
PSF is included by randomly perturbing the projected coordinates 
(x',y') with a probability described by the MGE PSF, before being 
included into the observational apertures (identical to Cappellari 
et al. 2006). We use a three-dimensional rectangular storage grid in 
the projected Cartesian coordinates x' and y' and the line-of-sight 
velocity v in the the sky plane. The resolution and rotation of this 
grid is adapted to the kinematical data that have to be reproduced. 
Optionally, the observational apertures can be binned as a final step 
to match any obervational binning. 

Only orbits with the correct degree of symmetry can be used 
to reproduce the density and potential. All orbits in a separable po- 
tential are indeed eightfold symmetric, but this need not be the case 
for resonant and irregular orbits in more general potentials. These 
orbits can therefore not be used directly in the reconstruction of the 
potential and density that we are interested in. This does not mean 
that they are useless, as we can enforce the required symmetries by 
apply a folding scheme to these orbits. Again, this scheme is sim- 
ilar to what is done for axisymmetric potentials, except that only 
orbits that are not symmetric with respect to the z = plane have 
to be corrected in that case [e.g. 1 : 1 (R, z) resonances, Richstone 
1982]. 

The folding scheme is based on the fact that a given asym- 



Table 2. The recipe that is used to mirror orbits in the three principal planes. 
Long-axis tubes are abbreviated by L-tube and short-axis tubes by S-tube. 



Position 



Box 



L-tube 



S-tube 



(x,y,z) 
(-x,y, z) 
(x, -y,z) 
(x,y, -z) 
(-x,-y,z) 
(-x,y, -z) 

-y, -z) 



(-'■ 

(v x 

(v x 



(Vx 



Vy,V z ) 
X,Vy,V z ) 
—Vy,V Z ) 
Vy, ~V Z ) 



-Vy,V z ) ( 

Vy,~V z ) ( 
V y 



(V x ,Vy,V z ) 
(-V x ,Vy,V z ) 

(Vx,Vy, ~V Z ) 
(V X , ~Vy,V z ) 

Vy, ~V Z ) 
~Vy,V z ) 



(Vx,Vy,V Z ) 
(V X , -Vy, V Z ) 



(- 



,v z ) 



—y, -z) (-vx,- 



-l-u 



~V Z ) (V x ,- 
, ~V Z ) (-Vx 



-Vy 



-v z ) 



(Vx,Vy, ~V Z ) 
(-Vx, ~Vy,V z ) 
(Vx,-Vy, ~V Z ) 
(~Vx,Vy, -V Z ) 



-v u 



-«.) (- 



-v z ) 



metric orbit has up to seven mirror images that are obtained by 
reflection in the principal planes (see the first column of Table [2j . 
These mirror images are also supported by the potential, but do not 
appear in the library because we sample orbital initial conditions 
only from one octant. All eight mirror orbits have identical prop- 
erties and are equally useful for the model. We may therefore add 
these eight orbits to obtain an orbit that has three planes of symme- 
try. In practice, this is done as follows. During orbit integration, the 
orbital weight that corresponds to a given point (x,y,z) is equally 
distributed over the eight mirror points. The contributions to both 
the intrinsic and projected density of all eight points are added up 
into one orbital building block. In this way, orbits that are asym- 
metric are included correctly, while orbits that already are eightfold 
symmetric are simply sampled more densely. 

More attention is required when calculating the kinematical 
observables of the orbital building blocks. If we reflect the veloc- 
ities in the same way as the coordinates, the total orbital building 
block has no net angular momentum. This is only correct for box 
orbits, while tube orbits, which are essential when fitting to the ob- 
served velocity field, conserve the sign of at least one component 
of the angular momentum vector. Therefore, the sign of the angular 
momentum of these orbits must be preserved also in the total or- 
bital building block. This is ensured by classifying the orbits on the 
basis of their angular momentum properties. Box orbits oscillate in 
all three directions, so that no components of the angular momen- 
tum are conserved, while long-axis and short-axis tubes conserve 
the sign of the angular momentum parallel to the long and short 
axis, respectively. This allows us to distinguish orbits by checking 
for which angular momentum component(s), if any, the sign is con- 
served during orbit integration. In doing this, inner and outer long 
axis tubes cannot be recognized separately. This is, however, not a 
problem for the present application (cf. Schwarzschild 1979). As 
can be seen from Fig. [3] where we have plotted the different orbital 
types with different symbols, the numerical classification agrees 
with the analytical calculations. 

We then apply the following scheme for reflections in the prin- 
cipal planes (see Table |2j- Box orbits: the average angular mo- 
mentum is zero, which allows us to reflect the velocity compo- 
nents in exactly the same manner as the coordinates. Long-axis 
tube orbits: the sign of the angular momentum around the long axis, 
L x = yv z — z v y , is conserved, which means that L x must be the 
same for all eight mirror points. Short-axis tube orbits: the sign of 
L z = x v y — y v x is conserved. 
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4.6 The influence of a central mass concentration 

Central mass concentrations have considerable influence on the or- 
bital structure of the galaxy as a whole and may induce chaotic 
behaviour. In an axisymmetric potential, the non-integrable regions 
of phase-space (usually referred to as the Arnold web) are not con- 
nected. This means that the diffusion of chaotic orbits is limited 
and their influence on the model is probably not significant, which 
justifies the fact that chaotic orbits are not treated in a special man- 
ner in axisymmetric models. Realistic triaxial potentials can sup- 
port a much larger fraction of chaotic orbits. The overall amount 
of chaotic motion depends on the cusp slope and on the mass of 
the central mass concentration (Gerhard & Binney 1985; Merritt & 
Fridman 1996; Valluri & Merritt 1998), but the different orbit fam- 
ilies experience fundamentally different effects, due to the central 
mass concentration 

The box orbits that are started from the inner most equipoten- 
tials are very difficult to integrate numerically. The central acceler- 
ations are large and the time-steps that are required to conserve the 
integration accuracy are correspondingly small, resulting in pro- 
hibitively long orbital integration times. This effect can be reduced 
by using a nonzero value for the softening length that was intro- 
duced in £|3 . 8 1 The DOPRI5 routine that we use to integrate the or- 
bits varies the time steps to match the desired accuracy. We found 
that even orbits that pass close to the black hole can be integrated 
with an accuracy of 1CP 5 in a reasonable time. The softening length 
that we used is typically two orders of magnitudes smaller than the 
radius of the sphere of influence of the black hole. This sphere of 
influence is defined as R. — GM/a 2 , which is the radius inside 
which the black hole potential dominates. 

Test particles that are dropped from equipotentials at some- 
what larger distances from the black hole are scattered off their 
original (box) orbits and may become chaotic (Gerhard & Binney 
1985). The trajectories of such particles can be described by a se- 
ries of regular segments and, given enough time, will fill most of the 
equipotential that corresponds to the orbital energy. Since equipo- 
tentials are rounder than equidensity curves and box orbits are the 
backbone of the triaxial shape, this process may destroy the triaxial 
shape from the inside out (e.g. Poon & Merritt 2002). 

Chaotic orbits are not time-independent, since their orbital 
densities do not average out on physical time-scales. In princi- 
ple, this means that a Schwarzschild model with an orbit library 
that includes irregular orbits is not stationary. However, evolution- 
ary studies of models that include chaotic orbits (Schwarzschild 
1993) and N-body simulations (Merritt & Fridman 1996; Holley- 
Bockelmann et al. 2002) display no dramatical shape-changes, even 
after very long times. This means that a model with chaotic orbits 
may be stationary for as long as a Hubble time and Schwarzschild 
solutions can be constructed also for models that contain chaotic 
orbits. 

The use of dithered orbital components (§ 12.2b is critical to 
create nearly time-independent models. In fact orbits started from 
similar initial conditions can follow very different trajectories. Be- 
cause of the dithering single 'sticky' chaotic orbits do not play a 
major role, since orbits are always bundled with nearby orbits. 

The influence of a central mass concentration on tube orbits 
is radically different. Low-energy tubes, which orbit at large radii, 
never approach the central mass concentration close enough to be 
significantly disturbed. Tubes that are launched from the principal 
planes close to the radius of influence of the black hole turn into 
precessing ellipses. Depending on the shape of the volume that the 
ellipse eventually fills, they may be labeled as pyramid orbits (Poon 



& Merritt 2002) or lens orbits (Sridhar & Touma 1999; Sambhus 
& Sridhar 2000). The precession rate of the ellipse is determined 
by the ratio of the stellar mass that is enclosed by the orbit and the 
central black hole mass. The integration time that is required for 
convergence of the orbital properties is therefore inversely propor- 
tional to the orbital radius. 

4.7 Number of orbits 

To summarize, we use two start spaces with three dimensions 
(E, 8, R) and (E, 8, <fi), which are connected at the equipotential 
boundary of every energy. The total number of orbits in the fit, ex- 
cluding dithering, is three times the number of orbits in the one start 
space, as the (x, z) start space is used twice and the stationary start 
space once. The number of orbits is denoted as 3 x Ne X Ne x Nr. 
Because of the dithering each effective orbit consists of 125 orbits, 
significantly smoothing the orbital component. The total number of 
orbits necessary to make a model is dependent on several factors: 
the number and spatial distribution of the observed kinematics, the 
size of the galaxy model and the shape of the potential. 

The effect of the number of orbits can be studied by determin- 
ing the quality-of-fit x 2 of the model as a function of the number 
of orbits. With increasing orbit numbers the % 2 decreases. When 
enough orbits are present, the model does not improve any more 
and the \ 2 does not decrease anymore. In our test cases we find that 
the model do not improve considerably when the orbit library con- 
sists of 2000 orbits or more. Self-consistent models with smaller 
orbit libraries have significantly larger x 2 , due to the fact that there 
are not enough orbits to reproduce the mass, especially at larger 
radii. We therefore decided to use an orbital sampling of 21 equipo- 
tential shells with 8 x 7 (8, R) starting points each, totaling 3528 
(x!25) orbits. 



5 SUPERPOSITION AND REGULARISATION 

To make a model with the computed orbits we need to combine the 
orbits in such a way that they fit the observations, while reproduc- 
ing the (intrinsic) mass distribution for self-consistency. Here we 
describe the construction of the orbital superposition and a way to 
ensure that our numerical solution is realistic. 

5.1 Finding the orbital weights 

The model has two components that need to be fitted: The kine- 
matic observations and the (intrinsic and projected) mass distri- 
bution. The kinematics are fitted using linearly superposed mass- 
weighted Gauss-Hermite (GH) moments (Rix et al. 1997). The fit 
is done by combining the orbits linearly by assigning each orbit 
an orbital weight (74). These orbital weigths directly represent the 
mass in each orbit and must therefore be positve (7^ ^0). 

The intrinsic mass grid W,5b and the aperture masses (the 
total amount of mass in each observed aperture: the zeroth GH mo- 
ment) must be added to the fit to ensure that the model is self- 
consistent with the density in which the orbits were calculated. Of- 
ten this is done by including them in the fit as an 'observable' (e.g., 
van der Marel et al. 1998; Valluri et al. 2004). However they are 
not actually observed and therefore it is difficult to assign an error. 
To include them into the fit they are usually assigned a hand-tuned 
fractional error so that the mass is reproduced well without influ- 
encing the fit of the kinematics. Here we use a different approach 
by including them as 'constraints' with bounds in the fit (similar to 
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Richstone & Tremaine 1988). This means that the orbital superpo- 
sition reproduces the intrinsic and aperture masses to within 2 % at 
all times, while finding the best-fitting kinematics. The total nor- 
malised mass of all the orbital weights is fixed using an equality 
constraint. 

The reason for including constraints is that the mass can al- 
most always be reproduced up to numerical percision (van der 
Marel et al. 1998; Poon & Merritt 2002) and is thus not relevant 
for finding the best-fit solution. We only want the mass to be re- 
produced to within 2 %, because this is the typical accuracy of (the 
MGE of) the observed surface brightness. Within these bounds the 
solver allows the mass to vary to find the best-fitting kinematics in 
the least squares sense. 

We use the sparse quadratic progamming solver QPB from 
the GALAHAD library (Gould, Orban & Toint 2003) to make the 
superposition, as this algorithm is capable of fitting the kinematics 
in the least squares sense while satisfying mass constraints. This 
algorithm optimizes the orbital weight 7 in the least squares sense: 



mm 

76-R" 



|A<7 - 



subject to the positivity constraint 

7 > 0, 

and the linear constraints, 
0.98 p M7 1.02 p. 



(24) 



(25) 



(26) 



Here A is the m x n projection matrix whose n columns give the 
model contribution of every orbit to the m kinematical observables 
b. The matrix M is a projection matrix giving the model contri- 
bution of the orbits to the mass in the various apertures and p is 
the mass derived by integrating the MGE model over the projected 
apertures and intrinsic mass grid. The total number of constraints in 
the fit (p) is 300 (intrinsic mass grid) + 1 (total mass) + the number 
of apertures (aperture masses). 

The quality of the model is determined by measuring the dis- 
crepancy between the model and the observations for different val- 
ues of the input parameters. This is done by calculating the x 2 , 
defined as 



X 



E 



Pi ~ Pi 



(27) 



in which is the number of observables (the number of apertures 
times the GH moments), Di is the observation for the ith observ- 
able, D* is the model prediction and ADi is the uncertainty that is 
associated with this value (the observational error). 



5.2 Regularisation scheme 

The quadratic programming problem to be solved is ill-conditioned 
in most applications, due to (close to) degenerate orbits. As a con- 
sequence, the orbital weight distribution for the solution with the 
smallest \ 2 ma y be a rapidly varying function, which is not likely 
to be realistic (Merritt 1993; Verolme & de Zeeuw 2002). This ef- 
fect can be reduced by adding linear regularisation equations to the 
problem (e.g., Zhao 1996; Rix et al. 1997), which is also known as 
'damping' in the field of linear programming. Such regularisation 
terms can be added to force the orbital weights towards a smoother 
function by minimizing their higher-order derivatives. 

Our two start spaces are sampled in three dimensions 
(E, R, 6) and [E, 6, (f>) and they are connected at the equipotential 
boundary. The three dimensions of the (a;, 2) start space roughly 



correspond to the three integrals of motion (in a separable poten- 
tial). We can thus enforce smoothness of our solution by adding 
regularisation terms to our minimisation routine in each of the three 
directions (k, I, m) of our start space. We adopt the second order 
finite differencing (Press et al. 1992) regularisation from Cretton 
et al. (1999), so for each orbit we add three equations to the ar- 
ray A given above, and minimises them in a least squares sense 



^(2£fc7(fc,;,m) — £fc-i7(fc_i,i, m ) — £fc+i7(fc+i,i,m)) = 0, 

M^kJ(k,l,m) — £fc7(fc,!-l,m) — £fc7(fc,i + l ,m) ) = 0, 
M^kJ(k,l,m) — €fc7(A,!,m-l) — ikJ(k,l,m+V)) = 0, 



(28) 



where 7(fc l ;, m ) represents the orbital weight at position (k, I, m) in 
the start space grid. The ^ weights are added to include the radial 
energy dependence of the model. It is estimated, a priori, as the 
normalised mass enclosed by each radial shell in the start space 



1 
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(29) 



where N is the number of orbits. The regularisation error A de- 
termines how much smoothing is perfomed. Increasing A increases 
the amount of smoothing. The optimal value of this A is usually 
time-consuming to determine (see, e.g., Cretton et al. 1999). How- 
ever, it has been shown elsewhere that a theoretical axisymmetric 
galaxy with a two-integral distribution function can be accurately 
reproduced by using this approach (Verolme et al. 2002; Cretton & 
Emsellem 2004). Applications of the axisymmetric Schwarzschild 
method have often used a value of 1/A = A = 0.25 as regular- 
isation (Cappellari et al. 2002; Krajnovic et al. 2005). As we will 
show in the next section it is also acceptable for the triaxial method. 



5.3 Testing the regularisation 

In the companion paper the DF of the triaxial test model is com- 
pared directly in terms of the integrals of motion (§ 5.4.3 in vdV07) 
and we find that the recovery of the DF is consistent with the quality 
of the input kinematics (Figs 12 and 13 in vdV07). The triaxial test 
model is ideally suited to test regularisation. To do this we com- 
pare the orbital mass weights directly. The top row in Fig. [4] dis- 
plays the computed orbital weights and kinematics for the triaxial 
Abel model from vdV07. The other rows show the orbital weights 
and kinematics for the best-fit Schwarzschild model with decreas- 
ing regularisation from top to bottom. The orbits at the radius of 
2" and 33" are outside the range where they are constrained by the 
kinematics, and as such the orbital weights for the Schwarzschild 
models are not expected to compare well with those of the Abel 
model. 

The distribution of (analytical) orbital weights for the Abel 
model is smooth with some sharp peaks. The reconstructed orbital 
weights from the Schwarzschild agree well with the analytical or- 
bital weights, except for high values of A, which corresponds to 
strong regularisation. The orbital weights of strongly regularised 
models are distributed more smoothly, adjacent orbits receive sim- 
ilar weight and the kinematics start to change. From Fig. [4] we see 
that the kinematics is affected by the regularisation at A > 0.2, and 
thus the optimal regularisation in this case has to be chosen to be 
A < 0.2. This will give satisfactory orbital weights, and kinematics 
that are consistent with the observations. The comparison is best 
for a A ~ 0.1 

There are many reasons why the reconstructed orbital weights 
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Figure 4. The effect of regularisation on the models. The top row shows the triaxial Abel from van de vdV07, while the other rows show the best-fit 
Schwarzschild models with decreasing regularisation, from top to bottom. (For a detailed comparison of the DF see vdV07.) The left columns show the orbital 
weights of the models in the same configuration as Fig. [5] whereas the two rightmost columns show the velocity and velocity dispersion fields. 



do not exactly match the analytical orbital weights. Most of them 
are minor numerical and discretisation effects. For example: (i) 
The Schwarzschild method samples the galaxy with discrete or- 
bits (computed in the reconstructed potential), which are related to 
the DF via their phase-space volume (Vandervoort 1984). The re- 
sulting orbital mass weights are evaluated in an approximated and 
numerical way (§5.4 in vdV07). (ii) Some symmetric orbits appear 
twice (or more often) in the orbit library. Without regularisation the 
(quadratic) solver ignores the second identical copy of this orbit, as 
they do not improve the fit. However these orbits do get assigned 
weight in the test case. A good example of this are the box orbits 
in the (x, z) start space, as they are added twice to the fit [like all 
(x, z) orbits]. These differences are only visible in this direct com- 
parison of the orbital weights. 

To estimate the effect of regularisation on the kinematical 
fit more quantitatively, we investigated the % 2 difference between 
the models with different values of the regularisation A. For the 
best-fit model with 2370 kinematical observables the total % 2 > s 
2588. Adding regularisation does increase the \ 2 °f the model. 
With A = 0.01 (little regularisation) it does not affect the fit to 



the kinematics (A^ 2 ~ 1)- When increasing A furter to 0.2 or 
even 4 (very strong regularisation), the A\ 2 changes to ~ 200 
and ~ 1000, respectively. These numbers reflect what one sees in 
Fig. [4] for A < 0.2 the kinematical fit does not change visibly, 
whereas for higher A (stronger regularisation) the kinematical fit 
becomes rapidly worse. 

One other important question is whether the regularisation 
changes the recovered input parameters, including the viewing an- 
gles, mass-to-light raio, anisotropy and black hole mass. This is 
nearly impossible to test with real galaxies, as their properties are 
unknown. The Abel model has known parameters and was used to 
test the recovery. We found that there is no difference in the best- 
fit parameters when a regularisation of A = 0.2 was chosen. The 
confidence intervals of the parameters do become smaller by using 
regularisation. This is expected, as the added regularisation terms 
decrease the freedom of the model and therefore increase the \ 2 - 

A notable exception, that we do not test here, is the recovery 
of the black hole mass. There are often few observables in the mod- 
els near the sphere of influence. The number of mass bins near the 
black hole is extremely limited and the kinematical observations 
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inside the sphere of influence of the black hole is very limited, usu- 
ally less than 10 observables. In this scenario it is conceivable that 
regularisation is needed, as the model might otherwise adapt the 
orbital structure to be able to accomodate the black hole (see also 
e.g., Magorrian 2006). Recovery of the black hole mass using reg- 
ularisation will be presented elsewhere. 



6 TESTS ON THE TRIAXIAL ABEL MODEL 

We test our method on the triaxial Abel model from the companion 
paper vdV07, introduced in i)4.1l and already used in § 15.31 above. 
Here, we outline further tests done on the Abel model. 

6.1 Internal orbital structure and DF recovery 

In vdV07 the best-fit triaxial Schwarzschild model to the input tri- 
axial Abel is presented. The Schwarzschild method only uses the 
information that can be observed in real galaxies, i.e. the two- 
dimensional surface brightness and the two dimensional stellar 
kinematics. The resulting best-fit model is an excellent fit and has 
a (reduced) \ 2 P er degree-of-freedom of ~ 1.1. 

Since Schwarzschild models only fit the projected observables 
it is not obvious that these models can recover the three-integral DF 
and the internal structure of the test model. By comparing the mass 
weights of the Schwarzschild model to the DF of the test model, 
vdV07 demonstrate that both the internal orbital structure and DF 
are recovered, with an accuracy similar to the typical (simulated) 
errors on the kinematics. 

6.2 Recovering the global parameters 

When constructing Schwarzschild models of the Abel model, we 
expect the kinematics of the Schwarzschild model to vary when 
changing intrinsic shape parameters (p,q,u), and thus the view- 
ing angles tp, i[>). The most obvious are the change of the zero- 
velocity curve as the projected axes change on the sky. Also, the 
characteristics of the orbits in the model are dependent on the shape 
of the potential. These effects are shown in Fig.[5] by showing mod- 
els of the analytic test data in linear steps of 0.05 in p or q. The 
models become significantly worse when changing the parameters 
away from the correct values. This shows that different intrinsic 
shapes support different orbits and that one cannot expect a model 
with the wrong potential to be able to fit the kinematics in all cases. 

To search the global parameters we sample the parameter 
space, by making linear steps of 0. 1 Mq/L© in M/L, and 0.05 in 
p, q and u (resulting in 100 different intrinsic shapes). For each cor- 
responding Schwarzschild model, the changes are quantified by the 
goodness-of-fit parameter A\ 2 ■ To visualize this four-dimensional 
parameter space, we calculate for a pair of parameters, say p and 
q, the minimum Ax 2 as a function of the remaining parameters, u 
and M/L in this case. The contour plots of the resulting marginal- 
ized A\ 2 for all different parameters for the Schwarzschild mod- 
els fitted to the observables of the Abel model are shown in Fig. [6] 
Since we sampled in intrinsic shape and not in viewing direction, 
the viewing angle sampling is not uniform. In particular the very 
round models, which are independent of <j) are not represented prop- 
erly. To this end, we create a dense grid in (p, q, u) and interpolate 
the x 2 linearly over this dense grid, resulting in the contour plots 
of V) in Fig.[6] 

The input parameters for which the simulated observables 
of the Abel model were obtained are M/L — 4Mq/L0 and 
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Figure 6. Marginalized contours maps of the Schwarzschild models fitted to 
the observables of the triaxial Abel model for different intrinsic shapes. The 
contours denote 2, 4 (thick line) 8 and 32 sigma confidence levels. Areas 
for which the MGE cannot be deprojected are left blank. The six upper-left 
panel show the intrinsic shape parameters (p, q, u) and mass-to-light ratio 
M/L; the three lower right panels show the viewing angles ip, i/>). The 
combination of 1? and ip is shown in a lambert equal area projection, seen 
down the north pole (z-axis). The x, y and z symbols give the location of 
views down those axis. The red diamond in each panel indicates the input 
parameters from the Abel model. 



(ti,<p,ip) = (70°, 30°, 101°). As outlined in §[5711 the latter view- 
ing angles convert to the intrinsic shape parameters (p, q, u) = 
(0.82, 0.67, 0.88), given the average projected flattening q' = 0.76 
of (the MGE model of) the surface brightness. These input parame- 
ters are denoted by a red diamond in the contour plots of Fig. [6] We 
find that the input M/L and (p, q, u) (and hence also the viewing 
angles) of the Abel model are accurately recovered, with a typical 
uncertainty of 10 % or less. 

Krajnovic et al. (2005) suggested that the recovery of the incli- 
nation for axisymmetric models is degenerate, which seems in con- 
flict with our recovery of the intrinsic shape. However, the kinemat- 
ics of the Abel model have a significant feature, namely a orthog- 
onal decoupled core, and this makes it plausible that the viewing 
angles are constrained quite strongly. We verified that for galaxies 
with no such distinguished kinematic feature, e.g., in the case of a 
(nearly) zero mean velocity map like for M87, the intrinsic shape 
is not well constrained. 



7 APPLICATION TO NGC 4365 

We now apply our method to NGC 4365, one of the prototypical 
galaxies with a kinematically decoupled core (KDC). It is a gi- 
ant E3 elliptical and it was one of the first objects in which minor 
axis rotation was discovered (Wagner, Bender & Moellenhoff 1988; 
Bender, Saglia, & Gerhard 1994). The peculiar velocity structure 
of this galaxy was partially unraveled by multiple long-slit obser- 
vations (Surma & Bender 1995), but the full two-dimensional kine- 
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Figure 5. Schwarzschild models with different intrinsic shapes fitted to the observables of the triaxial Abel model from vdV07, reveal that the kinematics vary 
significantly by changing the axis ratios p and q. The panels on the left show the velocity field of the Schwarzschild model for each value for p and q, while the 
input velocity field from the Abel model is shown in the top left corner. The panels on the right shows the same for the velocity dispersion. The input model 
has axis ratios (p, q) = (0.82, 0.67) 



matical structure was only revealed with the integral-field spectro- 
graph SAURON (Davies et al. 2001). 

Kinematically decoupled cores can be the result of a merger 
event, but can also occur when the galaxy is triaxial and supports 
different orbital types in the core and main body (Statler 1991). 
Davies et al. (2001) studied the first option and investigated the 
link between the kinematics and the line-strength distribution of 
NGC 4365. They found that the core and the main body are of sim- 
ilar age and that any mergers that led to the formation of the KDC 
must have occurred at least 12 Gyr ago, as otherwise younger stel- 
lar populations would have been detected. The orbital structure that 
supports the KDC and the main body cannot be observed directly 
and must be inferred from dynamical models. Statler et al. (2004) 
studied the viewing angles and triaxiality of the system using an ap- 
proach developed by Statler (1994a), which uses bayesian analysis 
to fit analytic solutions of the continuity equation to an observed ve- 
locity field. They found NGC 4365 to be strongly triaxial and seen 
almost along the long axis. The triaxial Schwarzschild method that 
was presented in the previous sections allows us to build compre- 
hensive dynamical models of this galaxy and investigate its intrinsic 
structure. 



7.1 Observations 

NGC 4365 was observed with SAURON on the nights of March 29 
and 30, 2000, for two different pointings, with an overlap in the 
central region. The exposures were combined and rebinned into a 
map with a slightly better spatial sampling (0.8 arcsec, compared 
to 0.94 arcsec for the individual lenslets) and a coverage of 33 x 
63 arcsec. Davies et al. (2001) give a full description of the obser- 
vations. 

To increase the signal-to-noise (S/N) ratio to sufficient lev- 



els for accurate determination of the kinematics, the datacube was 
spatially binned into 964 non-overlapping bins using the two- 
dimensional Voronoi binning of Cappellari & Copin (2003). A 
minimum S/N of 100 per spectral element was imposed. How- 
ever, many of the spectra have a much higher S/N value (up to 
~ 300), and more than one quarter of the spectral elements re- 
main unbinned. The stellar kinematics where extracted using the 
penalised pixel-fitting method (pPXF) of Cappellari & Emsellem 
(2004). For every Voronoi bin we extracted the velocity V, veloc- 
ity dispersion a and the higher-order GH moments hs and hi of 
the stellar LOS VD. 

The SAURON spectra have very high S/N so that the LOS VD 
can be reliably extracted from the data, however care has to be taken 
to minimize the effect of template mismatch, which dominates the 
error budget in the bright, high-S/7V central regions of the galaxy. 
To this end an accurate template was determined during the pPXF 
fit using the ~ 1000 stars of the MILES library (Sanchez-Blazquez 
et al. 2006), which span a large range of stellar atmospheric pa- 
rameters. Out of the MILES stars, only 14 are selected by pPXF 
to provide an accurate match to the observed average galaxy spec- 
trum, with an rms scatter in the residuals of only 0.17 %. From the 
observed residuals and Fig. B3 of Emsellem et al. (2004), we infer 
an upper limit of < 0.02 on the systematic error of the GH mo- 
ments, due to any remaining template mismatch. 

The re-reduced kinematics, shown in Fig. [7] are a significant 
improvement over the kinematics shown in Davies et al. (2001). 
They show a core in the inner ~ 6 arcsec that rotates around the 
minor axis. At larger radii the stars rotate around an axis offset 
by 82° , which is evidence that the system is intrinsically triaxial. 
The peak mean streaming velocities are 55 km s~ . The dispersion 
peaks at a value of ~ 260 km s _1 . 
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Figure 7. SAURON observations of NGC 4365. Top panels: from left to right: the mean velocity, velocity dispersion and GH moments /13 and of NGC 4365, 
as observed with the integral-field spectrograph SAURON. The pixels scale of the observations is 0'.'8. Middle panels: point symmetrized kinematics with 
respect to the galaxy center. Non symmetric deviations cannot be reproduced by a triaxial model anyway and the symmetrization guides the eye. Bottom 
panels: kinematic maps of the best-fit Schwarzschild model, obtained by adding the weighted contributions of the best-fitting set of orbits. The same color 
levels are used for both data and model. 



7.2 Mass model 

We used an HST/WFPC2/F814W image and a ground based image 
of NGC 4365 obtained with the 1.3m McGraw-Hill at the MDM 
observatory (from Falcon-Barroso et al., in prep.) to make an MGE 
(mass) model, using the software by Cappellari (2002). We ensure 
that the model is the roundest that is consistent with the observa- 
tions. This is done by setting a lower limit to the allowed projected 
flattening of 0.67 and an upper limit of 4.5° on the difference in 
position angle between the individual Gaussians. The modest dif- 
ference between the RMS-error of the free model (0.99 per cent) 
and the constrained MGE-model (1.02 per cent) suggests that these 
constraints do not lead to systematic errors in the mass model. The 
parameters of the MGE-model are given in Table [3] the surface 
brightness map and the MGE model fit are shown in Fig. [8] 

7.3 Dynamical models 

We calculate triaxial Schwarzschild models using orbit libraries of 
3 x 1176 orbits, 2352 of which are started in the (a;, z)-plane, the re- 
maining are dropped from the equipotential. We assume a distance 
of 23 Mpc for NGC 4365 (Mei et al. 2005). The assumed distance 
does not influence our conclusions about the internal structure of 
the galaxy, but lengths and masses scale linearly with the distance, 
while mass-to-light ratios are inversely proportional to the distance. 

A given triaxial model is determined by the mass-to-light ratio 
M/L, the shape parameters (p, q, u) — or equivalently, the view- 
ing angles ip, xjj) — and the mass M m of the central black hole. 
We fix the latter to M. = 3.6 x 10 s M Q , consistent with the black 
hole-sigma relation (Tremaine et al. 2002), as the SAURON obser- 
vations do not have enough spatial resolution to resolve the radius 
of influence of the black hole and therefore can not constrain the 
mass directly. Given a typical flattening of q' =0.74 of the Gaus- 
sians in the MGE model (Table[3}, we sample (p, q, it) linearly in 



Table 3. The parameters of the eleven Gaussians in the MGE fit to the 
combined HST/WFPC2/F814W and the ground-based image of NGC 4365. 
The colums give for each Gaussian respectively its number j, amplitude 
SBo = L/(2no-' 2 q'), dispersion a' , projected flattening q', and position 
angle offset Aip', as defined in eq. j4}- 



j log SB (L 0iJ pc- 2 ) log a' (arcsec) q' Alp' (°) 



1 


3.424 


-1.024 


0.800 


0.0 


2 


3.319 


-0.727 


0.800 


0.0 


3 


3.238 


-0.320 


0.800 


0.0 


4 


3.435 


-0.027 


0.670 


0.0 


5 


3.820 


0.138 


0.709 


0.5 


6 


3.740 


0.402 


0.698 


0.8 


7 


3.576 


0.648 


0.798 


0.0 


8 


3.106 


0.955 


0.737 


0.0 


9 


2.874 


1.224 


0.739 


0.0 


10 


2.400 


1.499 


0.741 


3.5 


11 


2.122 


1.833 


0.775 
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steps of 0.06. This results in 96 different intrinsic shapes which we 
combine with M/L values sampled linearly in 11 steps, from 3.0 
to 5.0 in solar units. This results in a total of 1056 Schwarzschild 
models, for each of which we compute the goodness-of-fit to the 
observations via x 2 ■ The resulting models show a smooth gradient 
in x 2 as can be seen in the A% 2 contours in Fig.QJj] As before (see 
ij |6.2t . to avoid an incomplete sampling in viewing angle space, we 
oversample in (p, q, u) before computing the correspinding A^ 2 
contours in ip, 
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Figure 8. Top: The contours of HST/WFPC2/F814W image of NGC 4365, 
overplotted with (smooth) contours of the best-fitting MGE. Bottom: The 
contours of the ground based image of NGC 4365 obtained with the 1.3m 
McGraw-Hill telescope and the best-fitting MGE. 



7.4 Best-fit model 

The best-fit model (the model with the lowest \ 2 ) is shown in 
Fig-El T° be able to estimate the quality of the fit, Fig.[9]shows the 
residuals per velocity moment. Each panel shows a moment sorted 
in order of increasing value. The observed moment is indicated by 
the black line, with errors shown as black dots. The red dots rep- 
resent the corresponding values from the best-fit model, which is 
in good agreement with the data. The corresponding x 2 = 4295, 
which implies a \ 2 P er degree-of-freedom (DOF) of 1.1. Statisti- 
cally the standard deviation of the \ 2 is ^/2(N ^ S — N par ). Given 
that we have 3856 observables in our model and less than 10 param- 
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Figure 9. The panels show the kinematics of the best-fitting model of 
NGC 4365. The line (with error bounds, shown as black dots) are the 
(sorted) observations, while the red dots show the model predictions. 
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Figure 10. Marginalized contours maps of the models of NGC 4365 for dif- 
ferent mass-to-light ratios, shape parameters and (corresponding) viewing 
angles. The layout is identical to that of Fig. [6] See ij |7.4l for discussion. 
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eters (N pa , T ), the expected scatter in the x 2 is much larger than the 
customary A^ 2 = 1 criterion. We therefore treat A\ 2 < \/27Vob7 
as a one sigma (68%) result. The best-fit parameters are then 
M/L = (3.5 ± 0.2) M0/L0 (in J-band) and the intrinsic shape 
parameters (p,q,u) = (0.97tg:jg, O.TOtglg, 0-^-om)- 

The best-fit M/L is just consistent with the value of (4.3 ± 
0.4) Mq/Lq predicted by the M/L-a relation derived from ax- 
isymmetric models by Cappellari et al. (2006), using a <r e = 
231 km s" 1 for this galax}Q It is interesting to see that the best- 
fitting M/ L does not vary significantly with any of the other model 
parameters as can be seen in the A% 2 contour plots in Figs.QJJ] For 
example, even for models with a non-optimal value of p, q or u, still 
the best-fit M/L ~ 3.5Mq/Lq. The total mass of the model, ob- 
tained by converting the total luminosity to mass using the best-fit 
M/L value is 4.8 x 10 11 M . 

7.5 Intrinsic shape 

In Fig. [TJJ we show the intrinsic shape parameters as a function 
of radius of our best-fit model. Inside the central 35", where we 
have the kinematic observations, the shape of NGC 4365 is fairly 
oblate, with p = £ ;> 0.95, 0.65 < q = £ < 0.75 and the 
triaxiality parameter T = (1 — p 2 )/(l — q 2 ) < 0.2. Further out 
- outside of the area where observed kinematics are available - the 
reconstructed density becomes more prolate. This is caused by the 
drop in p to ~ 0.85, while q stays approximately the same. As a 
result the triaxiality T rises with increasing radius to ~ 0.6. 

The axis ratios found by Statler et al. (2004) using velocity 
field fitting of the SAURON data are (p) ~ 0.84, (q) ~ 0.60 which 
are just outside our 99 % confidence, and do not agree with their 
claim of strong triaxiality ((T) ~ 0.45) inside 35" of the galaxy. 
Their lower limit on the triaxiality is not consistent with our mea- 
surement in the center. We return to this below. 

The corresponding best-fit viewing angles are tp, ip) = 
(68°, 73°, 91°). To give an indication of the uncertainty of these 
values we show a Lambert azimuthal equal area projection of ■d and 
tp in Fig. [TO] This is comparable to a similar projection in Statler 
et al. (2004, their Fig. 4). Our best-fit viewing angle is also not 
consistent with theirs. The velocity field fitting (VF) method does 
not include the dispersion and other higher-order velocity moments 
and assumes a plausible solution of the continuity equation, while 
our Schwarzschild method does fit higher moments up to al least 
hn and does not enforce any constraints on the DF. 

The effect of the higher moments on the inferred shape can 
be studied by making models without them. We find that exclud- 
ing hi has no significant effect on the recovered shape. Remov- 
ing h-i and/or lower moments h-2 and hi (representing the disper- 
sion and mean velocity) changes the best-fit shape significantly 
and the reconstructed LOSVD of these models show significant 
deviations from Gaussian. These models are therefore not a good 
representation of the observed LOSVD. This tests show that the 
Schwarzschild models need at least /13 to accurately recover the 
inferred shape and observed LOSVD. 

To compare our modeling to the VF method directly, we made 
models where we only fit the mean velocity. We find that these 
models are completely degenerate and no minimum could be found. 
Instead of our least squares approach a likelihood method could 

2 The velocity dispersion a e is derived from the SAURON kinematics by 
luminosity-weighting all the spectra within one effective (half-light) radius 
and fitting a single Gaussian LOSVD. 
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Figure 11. Intrinsic axis ratios of the density distribution of the best-fit 
model of NGC 4365 as a function of radius. The inner 35" is nearly oblate 
axisymmetric (p > 0.95), whereas the outer part is much more triaxial 
(T > 0.2). The kinematic observations extend to ~ 35". 

be used to find a solution in the probabilistic sense, however such 
methods are unpractical for Schwarzschild models, because of the 
many iterations required. The VF method does use Bayesian anal- 
ysis, with a prior on the dispersion. 

For NGC 4365 to be a pure long-axis rotator with a KDC that 
consists purely of short-axis tubes, we expect a best-fitting mis- 
alignment angle ip that coincides with the observed kinematic mis- 
alignment of 82° ± 2° (or the symmetric 98°). In fact, all the mod- 
els with \tp — 90| > 5°are strongly ruled out and we conclude that 
NGC 4365 is not consistent with a pure long-axis rotator. 

7.6 The orbital structure 

The observed kinematics that go out to ~ 35" show that NGC 4365 
must be intrinsically triaxial, due to the kinematically decoupled 
core and the misaligned large scale rotation. This seems to be in 
conflict with the (nearly) oblate axisymmetric shape inside 30" of 
our best-fit model (Fig. II It. as this shape does not support the ro- 
tation around the major axis seen in the observed kinematics. 

Fig.ll2lshows the cumulative mass per orbit-type as a function 
of intrinsic radius. As expected from the shape in the inner region 
the stars on short axis tube orbits are dominant, accounting for 75% 
of the mass inside 30". The stars on long axis tubes become signif- 
icant in the model only outside 30". 

To understand the rotation seen in the observations we look 
at the balance of stars on prograde and retrograde orbits, shown in 
Fig-CH It shows that the stars in the KDC are on prograde short axis 
orbits inside 6", while up to 30" the stars on short axis orbits do not 
have a preferred rotation direction. Only 15% procent of the stars 
inside 30" move on long axis orbits, but nearly all of them move in 
the retrograde direction, and thus contribute to the observed mean 
velocity. 

To make the link between the orbital structure and the observa- 
tions, we show the unbinned kinematics of the stars on each type of 
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Figure 12. Properties of the orbital distribution. Left: Balance of prograde 
and retrograde rotation as a function of radius. The balance is a fraction 
of the total mass at that radius. The black vertical line represents the radius 
beyond which we do not have kinematic observations from SAURON. Right: 
Cumulative fraction of orbit type as function of their start radius. 

orbit individually, extrapolated over a region larger than the orginal 
observations in Fig. [13] The mass fraction, velocity and dispersion 
field are shown for (i) all stars, (u) stars on prograde short axis 
orbits, (Hi) stars on retrograde short axis orbits, (iv) stars on pro- 
grade and retrograde short axis orbits combined, and (v) stars on 
long axis and box orbits combined. 

In the decomposition of Fig. [T3] the stars on prograde and 
retrogrode short axis orbits have large velocities (jtWzj > 150 
kms -1 , \a\ ~ 160 kms -1 ) in opposite directions and the com- 
bination of them lead to a velocity field with very little rotation 
OiWij < 60 kms" 1 ) and a high dispersion field (\a\ ~ 220 
km s -1 ). The stars on long axis orbits also rotate quickly (\v ma x\ > 
150 km s" 1 ) and although they contribute only ~ 20% of the mass, 
the large velocity adds significantly to the large scale rotation seen 
in the observations. 

Two other galaxies NGC 4550 and NGC 4473 show very simi- 
lar features to NGC 4365, both have unusually high sigma along the 
major axis (Emsellem et al. 2004). They both consist of two coun- 
terrotating disks, similar to the two main components in NGC 4365: 
the prograde and retrograde short axis orbits (Rix et al. 1992; Cap- 
pellari et al. 2007). 

The KDC is not very distinct in the decomposition (Fig. 1 1 3 1 > 
and is perhaps only an appearance. The only way to really disen- 
tangle the KDC is through the unbalance of the central stars on 
prograde short axis orbits (Fig.ll2b. However, in the orbital weights 
(Fig.l 14b the KDC seems an integral part of all the stars on prograde 
short axis tube orbits as these stars are continuously distributed in 
orbital weights. It is therefore difficult to see the KDC as a distinct 
kinematic component. 



8 DISCUSSION AND CONCLUSION 

We have presented a flexible method to build dynamical models of 
triaxial early-type galaxies, allowing for position angle twist and 
ellipticity variation in their surface brightness, as well as a central 
supermassive black hole. The method is based on Schwarzschild's 
orbit superposition technique and uses the observed surface bright- 
ness distribution and observed kinematics to make a triaxial model 
of the observed galaxy. Our models can be constrained by observa- 
tions of the full line-of-sight velocity distribution. 

We discussed tests of our method on the triaxial Abel model 
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Figure 13. Extrapolated kinematics of the model of NGC 4365, showing 
(from top to bottom) the contribution from all stars, from stars on prograde, 
retrograde and combined short axis orbits, and from stars on long axis and 
box orbits. The contours show equal flux and the box shows the rectangle 
inside the SAURON observations. 

with a separable potential from vdV07, which illustrates the accu- 
racy of our orbit classification method and the effect of regularisa- 
tion. Tests with the viewing angles showed that we can constrain 
the intrinsic shape of galaxies with significant structure in their ve- 
locity field. 

We also presented results of an application on two- 
dimensional data of the E3 galaxy NGC 4365, obtained with the 
integral-field spectrograph SAURON. We showed that our method 
is capable of reproducing the main observational features to within 
the errors. We found a best-fitting (J-band) stellar mass-to-light 
ratio of 3.5 ± 0.2 in solar units and best-fitting viewing angles 
of ( , &,ip,ij))= (68°, 73°, 91°). The characteristic axis ratios are 
p ^ 0.95, 0.65 < q < 0.75 inside 35". By applying a simple reg- 
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ularisation scheme, we were able to determine the distribution of 
orbital weights, which provided us with a view on the orbit struc- 
ture of this galaxy. We find the inner part to be nearly oblate ax- 
isymmetric, with most of the stars equally divided on prograde and 
retrograde short axis tube orbits. Further out the galaxy becomes 
more triaxial, and the stars orbit on both long axis and short axis 
tubes. The KDC seen in the observations is not dynamically dis- 
tinct from the main body of the galaxy. More evidence for the idea 
that the 'decoupled core' is part of the main body of the galaxy, 
comes from the stellar ages determined by Davies et al. (2001). 
The ages of the stars where determined to be at least ~ 12 Gyr and 
they do not show a strong dependence which radius. Overall our 
orbital structure is consistent with the results from Statler (1991) 
and Arnold et al. (1994). 

An important consideration is the stability of triaxial galaxies. 
A significant fraction of the centrophilic box orbits can become 
chaotic in the presence of a central cusp or a supermassive black 
hole (Gerhard & Binney 1985; Valluri & Merritt 1998). As box 
orbits are crucial for supporting the triaxial shape, it is not evi- 
dent whether a triaxial object with a central black hole can retain 
its shape over a Hubble time (Lake & Norman 1983). Earlier N- 
body simulations of triaxial galaxies in which a central mass con- 
centration is grown indeed show a fairly rapid evolution towards 
a rounder shape in the inner parts (e.g. Merritt & Quinlan 1998; 
Valluri & Merritt 1998), but these results were challenged recently 
(Holley-Bockelmann et al. 2002; Poon & Merritt 2002). The intrin- 
sic shape of our best-fit model is nearly oblate axisymmetric in the 
centre, and more triaxial further out. This might be a sign of evolu- 
tion towards a axisymmetric shape, induced by the central massive 
black hole from the inside out. Clearly, we need to obtain a bet- 
ter understanding of whether triaxial galaxies can reach stationary 
equilibrium and if not, what the time-scale of the transition toward 
a nearly spheroidal shape is. 

The extension from an axisymmetric to a triaxial implementa- 
tion of Schwarzschild's method opens up a wide range of applica- 
tions. For example, while the mass of the central black hole seems 
to be correlated with other properties of the galaxy (Ferrarese & 
Merritt 2000; Gebhardt et al. 2000; Tremaine et al. 2002), nearly all 
black hole estimates were derived with dynamical models of (edge- 
on) axisymmetric models. In such models, the increase in line-of- 
sight motion towards the center that is seen in high-resolution ob- 
servations of many nearby galaxies can indeed only be explained 
by a black hole. In triaxial systems, box orbits provide an alterna- 
tive way of creating motion along the line-of-sight (depending on 
the viewing angles, see e.g., Gerhard 1988). As a result, a given 
observed velocity profile may require a different black hole mass 
when the galaxy is allowed to be triaxial, which in turn may influ- 
ence the black hole mass correlations if the intrinsic shape corre- 
lates with luminosity, as has been suggested. 

Another area of interest is the validation of predictions on the 
halo shapes from galaxy merger (e.g., Wechsler et al. 2002) and 
ACDM cosmology simulations (e.g., Navarro & Steinmetz 2000). 
They predict the existence of strongly triaxial halos as a result of 
merging. To confirm these simulations, our method can be used 
to measure the halo shapes of a representative sample of galaxies, 
using kinematical observation at large radii, where the halo mass 
dominates. 

Our method can be further extended in a number of aspects. 
For example, we assumed that the galaxy as a whole is non- 
rotating. The reason for this is that inclusion of figure rotation fur- 
ther complicates matters (Schwarzschild 1982; Heisler, Merritt & 
Schwarzschild 1982), while it may not be crucial for the modeling 



of existing observations of giant elliptical galaxies. The fitting of 
kinematics can be improved by fitting the LOSVD directly or by 
using eigen-velocity profiles (Houghton et al. 2006). Additionally, 
the method can be enhanced to include line strength information 
and multiple stellar populations, to study the distribution of stellar 
ages and metallicities within galaxies. Even without additional ex- 
tensions, the triaxial Schwarzschild method allows us to investigate 
the intrinsic structure and orbital make-up of early-type galaxies. 
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